Word Midas Powered by StringNet: Discovering Lexicogrammatical Constructions in Situ
Abstract
Adult second language learners face the daunting but underappreciated task of mastering patterns of language use that are neither products of fully productive grammar rules nor frozen items to be memorized. Word Midas, a web browser extension, targets this uncharted territory of lexicogrammar by detecting multiword tokens of lexicogrammatical patterning in real time in situ within the noisy digital texts from the user’s unscripted web browsing or other digital venues. The language model powering Word Midas is StringNet, a densely cross-indexed navigable network of one billion lexicogrammatical patterns of English. These resources are described and their functionality is illustrated with a detailed scenario.
Related resources
StringNet as a Computational Resource for Discovering and Investigating Linguistic Constructions
We describe and motivate the design of a lexico-grammatical knowledgebase called StringNet and illustrate its significance for research into constructional phenomena in English. StringNet consists of a massive archive of what we call hybrid n-grams. Unlike traditional n-grams, hybrid n-grams can consist of any co-occurring combination of POS tags, lexemes, and specific word forms. Further, we d...
Word similarity using constructions as contextual features
We propose and implement an alternative source of contextual features for word similarity detection based on the notion of lexicogrammatical construction. On the assumption that selectional restrictions provide indicators of the semantic similarity of words attested in selected positions, we extend the notion of selection beyond that of single selecting heads to multiword constructions exerti...
The StringNet Lexico-Grammatical Knowledgebase and its Applications
This demo introduces a suite of web-based English lexical knowledge resources, called StringNet and StringNet Navigator (http://nav.stringnet.org), designed to provide access to the immense territory of multiword expressions that falls between what the lexical entries encode in lexicons on the one hand and what productive grammar rules cover on the other. StringNet’s content consists of 1.6 bil...
Discovering Light Verb Constructions and their Translations from Parallel Corpora without Word Alignment
We propose a method for joint unsupervised discovery of multiword expressions (MWEs) and their translations from parallel corpora. First, we apply independent monolingual MWE extraction in source and target languages simultaneously. Then, we calculate translation probability, association score and distributional similarity of co-occurring pairs. Finally, we rank all translations of a given MWE ...
Discovering and Analyzing the Intellectual Structure and Its Evolution in Core Journals of "Knowledge and Information Science" during 2004-2013
Purpose: This study aims to reveal the intellectual structure of Knowledge and Information Science and its evolution, along with a review of the journals' subject scope, based on 6,830 abstracts from the ten core journals in JCR 2013 over the ten years 2004-2013. Methodology: In this research, co-word and correspondence analyses of 150 words, selected by tf-idf weight, were done after parametri...
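The StringNet abstract above describes hybrid n-grams as sequences in which each position may be a POS tag, a lexeme, or a specific word form. A minimal sketch of that idea follows. It is not StringNet's actual implementation: the `Token` type, the `hybrid_ngrams` function, the slot labels, and the POS tags are all hypothetical names chosen for illustration.

```python
from itertools import product
from typing import Iterator, NamedTuple, Sequence, Tuple


class Token(NamedTuple):
    """Hypothetical token record; field names are assumptions."""
    form: str   # specific word form, e.g. "took"
    lemma: str  # lexeme, e.g. "take"
    pos: str    # part-of-speech tag (illustrative placeholder tags)


Slot = Tuple[str, str]  # (level, value), e.g. ("lemma", "take")


def hybrid_ngrams(tokens: Sequence[Token], n: int) -> Iterator[Tuple[Slot, ...]]:
    """Yield every hybrid n-gram for each window of n consecutive tokens."""
    for start in range(len(tokens) - n + 1):
        window = tokens[start:start + n]
        # Each position can be realized at one of three levels of abstraction.
        choices = [
            (("form", t.form), ("lemma", t.lemma), ("pos", t.pos))
            for t in window
        ]
        # The Cartesian product enumerates every combination of levels.
        yield from product(*choices)


sentence = [
    Token("took", "take", "VERB"),
    Token("a", "a", "DET"),
    Token("walk", "walk", "NOUN"),
]
grams = list(hybrid_ngrams(sentence, 2))
```

Because every position has three possible realizations, a window of n tokens yields 3^n candidates, so the sketch grows quickly with n. How StringNet stores, filters, or cross-indexes such patterns is not stated in the text here and is not modeled.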